Bioinformatics of Brain Diseases
FIGURE 8.2
Total number of experiments in the GEO repository for brain diseases and
disorders and total number of genes associated with the studied diseases and
disorders from the DisGeNET database (as of August 2023).
biological information. Most manufacturers provide analysis tools alongside
their microarray or RNA-seq products. However, there are also open-source
tools that offer various methods for analyzing the data.
Bioconductor is an open-source software project based on the R programming
language that helps analyze genomic data (both microarray and RNA-seq)
generated by wet-lab experiments (https://www.bioconductor.org) [23]. It is
essentially a repository of R packages. There are currently 3593 packages in
its environment, most of which are software packages (2230), alongside
annotation (912) and experimental data (421) packages as well as workflow
packages (30) (as of August 2023). Here, we can find genomic data analysis
packages like LIMMA (linear models for microarray data), which fits a linear
model to each gene to determine differential expression once the data have
been normalized, for example with RMA (Robust Multi-array Average), to
account for noise [24]. In addition to LIMMA, there are other
packages in Bioconductor that are used in analyzing RNA-seq and microarray
data, such as edgeR and DESeq2 [25]. The edgeR package uses an overdispersed
Poisson (negative binomial) model to account for both biological and
technical variation [26]. The DESeq2 approach uses shrinkage estimation for
dispersions and fold changes to improve the stability and interpretability
of its estimates [27]. As previously stated, these and
other packages implement a variety of statistical methodologies for differential
analysis. Once the analysis results are available, we can identify
significantly differentially expressed genes by applying cutoffs to measures
such as p-values and fold changes.
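
The cutoff step described above can be illustrated with a minimal Python sketch. It is not the implementation used by LIMMA, edgeR, or DESeq2. The gene names and numbers are made up, the `GeneResult` type and the `select_significant` helper are hypothetical, and the thresholds (adjusted p-value below 0.05, absolute log2 fold change of at least 1) are common conventions assumed here rather than values taken from the text. The Benjamini-Hochberg adjustment for multiple testing is also an assumption; the text only mentions p-values and fold changes.

```python
from typing import List, NamedTuple


class GeneResult(NamedTuple):
    """One row of a (hypothetical) differential expression result table."""
    gene: str
    log2_fold_change: float
    p_value: float


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def select_significant(
    results: List[GeneResult],
    alpha: float = 0.05,            # assumed adjusted p-value cutoff
    min_abs_log2fc: float = 1.0,    # assumed fold-change cutoff (2-fold)
) -> List[str]:
    """Keep genes that pass both the p-value and the fold-change cutoff."""
    adjusted = benjamini_hochberg([r.p_value for r in results])
    return [
        r.gene
        for r, q in zip(results, adjusted)
        if q < alpha and abs(r.log2_fold_change) >= min_abs_log2fc
    ]


# Made-up values for illustration only; not real experimental data.
results = [
    GeneResult("GENE_A", 2.3, 0.0004),
    GeneResult("GENE_B", -1.6, 0.003),
    GeneResult("GENE_C", 0.4, 0.001),
    GeneResult("GENE_D", 1.2, 0.20),
    GeneResult("GENE_E", -0.1, 0.75),
]

adjusted = benjamini_hochberg([r.p_value for r in results])
significant = select_significant(results)
print(significant)
```

In this toy table, GENE_C has a small p-value but a fold change below the cutoff, and GENE_D has a large fold change but a weak p-value, so neither is selected. Requiring both criteria is what keeps statistically significant but biologically small changes, and large but noisy changes, out of the final list.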